HP2PC: Scalable Hierarchically-Distributed Peer-to-Peer Clustering

Authors

  • Khaled M. Hammouda
  • Mohamed S. Kamel
Abstract

In distributed data mining models, adopting a flat node distribution model can limit scalability. To address modularity, flexibility, and scalability, we propose a hierarchically-distributed peer-to-peer architecture and algorithm for data clustering (HP2PC). The architecture is based on a multi-layer overlay network of peer neighborhoods. Supernodes, which act as representatives of neighborhoods, are recursively grouped to form higher-level neighborhoods. Peers at a given level of the hierarchy cooperate within their respective neighborhoods to perform clustering. Using this model, we can partition the clustering problem in a modular way, solve each part individually, then successively combine clusterings up the hierarchy, where increasingly global solutions are computed. The algorithm was applied to a distributed document clustering problem and achieved decent speedup with clustering quality comparable to the centralized approach.
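The bottom-up combination described in the abstract can be illustrated with a toy sketch. This is not the authors' HP2PC algorithm: the abstract does not specify how neighborhoods merge clusterings, so the sketch assumes one-dimensional data, summarizes each cluster as a (centroid, point count) pair, and merges the two closest centroids greedily until k remain. The names `reduce_to_k`, `cluster_hierarchy`, and the dictionary layout of the hierarchy are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def reduce_to_k(summaries, k):
    """Greedily merge the two closest centroids until only k remain.

    Each summary is a (centroid, weight) pair, where weight is the number
    of original points the centroid represents. (Assumed merge rule, not
    taken from the paper.)
    """
    items = list(summaries)
    while len(items) > k:
        # Find the pair of summaries whose centroids are closest.
        i, j = min(
            combinations(range(len(items)), 2),
            key=lambda p: abs(items[p[0]][0] - items[p[1]][0]),
        )
        (ci, wi), (cj, wj) = items[i], items[j]
        # Weighted mean keeps the merged centroid equal to the mean of
        # all points represented by the two summaries.
        merged = ((ci * wi + cj * wj) / (wi + wj), wi + wj)
        items = [s for idx, s in enumerate(items) if idx not in (i, j)]
        items.append(merged)
    return sorted(items)


def cluster_hierarchy(node, k):
    """Cluster a hierarchy bottom-up.

    A node is either a peer, {"points": [...]}, or a neighborhood,
    {"children": [...]}. Only cluster summaries travel up the hierarchy,
    never the raw points.
    """
    if "points" in node:
        local = [(float(x), 1) for x in node["points"]]
        return reduce_to_k(local, k)
    combined = []
    for child in node["children"]:
        combined.extend(cluster_hierarchy(child, k))
    return reduce_to_k(combined, k)


# Two neighborhoods of two peers each, combined at a root level.
hierarchy = {
    "children": [
        {"children": [{"points": [1.0, 2.0, 10.0]}, {"points": [1.5, 11.0]}]},
        {"children": [{"points": [9.0, 10.5, 2.5]}, {"points": [0.5, 12.0]}]},
    ]
}

result = cluster_hierarchy(hierarchy, k=2)
print(result)  # [(1.5, 5), (10.5, 5)]
```

Each level sees at most k summaries per child, so the work at a supernode depends on the size of its neighborhood rather than on the total amount of data below it. That is the modularity the abstract points to, shown here under the simplifying assumptions stated above.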


Similar Resources

P2P Network Trust Management Survey

Peer-to-peer (P2P) applications are no longer limited to home users and are starting to be accepted in academic and corporate environments. While file sharing and instant messaging applications are the most traditional examples, they are no longer the only ones benefiting from the potential advantages of P2P networks. For example, network file storage, data transmission, distributed computing, and co...


Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks

This paper proposes a scalable, local privacy-preserving algorithm for distributed peer-to-peer (P2P) data aggregation useful for many advanced data mining/analysis tasks such as average/sum computation, decision tree induction, feature selection, and more. Unlike most multi-party privacy-preserving data mining algorithms, this approach works in an asynchronous manner through local interactions...


Connectivity Based Node Clustering in Decentralized Peer-to-Peer Networks

Connectivity-based node clustering has wide-ranging applications in decentralized Peer-to-Peer (P2P) networks such as P2P file sharing systems, mobile ad-hoc networks, P2P sensor networks and so forth. This paper describes a Connectivity-based Distributed Node Clustering scheme (CDC). This scheme presents a scalable and efficient solution for discovering connectivity-based clusters in peer n...


A Quorum Based Distributed Mutual Exclusion Algorithm for Multi-Level Clustered Network Architecture

Different permission-based algorithms have been proposed to solve the mutual exclusion problem. With the emergence of peer-to-peer computing, distributed applications are spread over a large number of nodes. Cluster-based solutions are scalable for a large number of participants. Some algorithms have been proposed using a cluster topology. But the number of participating nodes is increasing e...


A Scalable Semantic Indexing Framework for Peer-to-Peer Information Retrieval

The exponential growth of data demands scalable and adaptable infrastructures for indexing and searching a huge number of data sources with high accuracy and efficiency. Existing centralized search engines are not scalable and suffer from single points of failure. The recent work on P2P index construction partitions the document vectors either randomly or statically, making it difficult to trade...




Publication date: 2007